SUMMARY


This  report  examines  a method  to  correct  for  the  effect  of  range  restriction  on 
correlation  coefficients.  Often,  it  is  necessary  to  estimate  the  correlation  of  two 
variables  in  a large,  diverse  population  from  a smaller,  selected  population  in  which 
the  ranges  of  the  variables  have  been  restricted.  Range  restriction  (sometimes  called 
curtailment)  occurs  whenever  there  is  a real  change  of  variance  in  a particular  variable 
in  the  selected  population.  A direct  calculation  of  the  correlation  coefficient  for  the 
larger  group  from  the  data  sample  of  the  smaller  group  is  misleading.  To  more 
accurately  estimate  the  true  correlation,  methods  of  correcting  for  range  restriction 
have  been  developed. 

The  method  described  in  this  report  is  one  of  the  most  general.  It  handles  the  case 
in  which  the  sample  population  has  been  directly  restricted  in  several  variables.  All 
coefficients  are  corrected  simultaneously,  so  that  an  entire  correlation  matrix  can  be 
corrected. This report gives the correction equations and the assumptions needed to
derive them. A FORTRAN program used to compute the corrected correlation coefficients
is given in appendix A, and an APL program in appendix B.

The correction method is illustrated using the aptitude test scores of Marines
selected for formal training and similar test data for 60,000 FY-1975 Marine recruits.
An  empirical  test  for  the  effectiveness  of  the  equations  is  given.  In  addition,  another 
technique  for  correcting  for  range  restriction  is  discussed  and  compared.  In  general, 
the  method  documented  in  this  report  appeared  superior  to  the  other  examined. 

INTRODUCTION


Range  restriction  is  a problem  that  is  often  encountered  in  correlation  analysis. 
Recall that the correlation coefficient r quantifies the extent to which two variables covary
in  a particular  population.  It  is  misleading  to  speak  of  the  correlation  between  two 
variables  without  specifying  the  sample  population,  since,  generally,  the  size  of  r is 
related  to  the  ranges  of  the  correlated  variables  in  the  measured  population.  Usually, 
the  correlation  coefficient  computed  from  a population  in  which  the  ranges  of  the  variables 
have  been  restricted  will  be  smaller  than  the  r computed  from  a broader,  unrestricted 
population.  Since  it  is  often  desirable  to  estimate  the  correlation  of  two  variables  in 
a large  population  from  data  obtained  from  a more  restricted  population,  it  is  necessary 
to  correct  the  correlation  coefficients  computed  in  the  smaller  population  for  the  effects 
of  range  restriction. 

Suppose,  for  example,  that  a group  of  people  are  given  intelligence  test  A,  and  that 
only  those  who  score  above  90  are  administered  test  B.  Thus,  the  results  of  test  A are 
used  to  restrict  the  group  who  take  test  B.  It  is  often  of  interest  to  find  how  well  test  A 
predicts  performance  on  test  B.  That  is,  what  is  the  correlation  between  tests  A and  B 
for  the  total  group?  Because  the  group  that  took  both  tests  A and  B was  restricted  on  the 
basis  of  performance  on  test  A,  that  question  cannot  be  answered  by  direct  calculation  of 
the  correlation  coefficient.  The  problem  would  be  even  more  complex  if  several  tests 
were the basis of restriction. This report examines a method of correcting for range
restriction that can handle the problem of multiple curtailment. Another, simpler method
is also discussed and compared.
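
The effect described in this example can be reproduced with a small simulation. The sketch below is illustrative only: the score scale (mean 100, standard deviation 20), the population correlation of 0.6, and the sample size are assumptions chosen for the demonstration, not values taken from the study.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n=20000, rho=0.6, cutoff=90.0, seed=1):
    """Draw n (test A, test B) score pairs with population correlation rho,
    then keep only the pairs whose test A score is above the cutoff."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    a_scores, b_scores = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        a_scores.append(100.0 + 20.0 * z1)
        b_scores.append(100.0 + 20.0 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    selected = [(a, b) for a, b in zip(a_scores, b_scores) if a > cutoff]
    r_total = pearson_r(a_scores, b_scores)
    r_selected = pearson_r([a for a, _ in selected], [b for _, b in selected])
    return r_total, r_selected

r_total, r_selected = simulate()
print(f"total group r = {r_total:.3f}, selected group r = {r_selected:.3f}")
```

In runs of this kind the correlation in the selected group comes out noticeably lower than in the total group, which is why the restricted coefficient cannot be used directly as an estimate for the total group.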


CORRECTION  FOR  MULTIPLE  CURTAILMENT 


The  problem  of  range  restriction  often  occurs  when  the  assignment  of  personnel  to 
jobs  or  schools  is  based  on  test  performance.  The  validity  of  a test  is  measured  by  how 
accurately  it  predicts  the  later  performance  of  a member  of  the  general,  or  unrestricted, 
population.  A direct  measurement  of  a test's  validity  is  impossible  if  that  same  test  is 
used  for  personnel  selection,  since  performance  measures  exist  only  for  the  subset  of 
the  general  population  selected  for  a job.  In  order  to  estimate  accurately  the  validity  of 
a test  in  the  general  population,  the  correlation  coefficients  calculated  from  the  selected 
population  must  be  corrected  for  the  effects  of  range  restriction. 

A recent  study  of  Marine  Corps  school  performance  (reference  1)  provides  an  example 
of  the  problem  of  range  restriction,  and  will  be  used  throughout  this  report  to  illustrate 
how the problem may be solved. At the time of the study, the Army Classification Battery
(ACB-61) of 11 subtests was administered to all Marine recruits at the recruit depot.
Although the range of scores varied slightly among the different tests, the approximate range
was from 50 to 160, and all the test scores were approximately normally distributed. A
number  of  composite  scores  were  computed  from  linear  combinations  of  the  subtests. 

Table  1 lists  the  subtests  and  composites. 

After recruit training, recruits were selected for assignment to jobs and formal
training based on their ACB-61 scores. For example, to be admitted into
the Sea Duty Indoctrination School, a Marine had to score 90 or above on his General
Technical (GT) test. By thus restricting the range of GT scores, the variance of the
GT scores in the selected school population was reduced. At the end of school training,
a recruit was assigned a final course grade (FCG) ranging from 0 to 100.

In  order  to  evaluate  how  well  GT  predicts  a recruit's  performance  in  any  course, 
the  correlation  coefficient  between  GT  and  FCG  may  be  computed.  Figure  1 illustrates 
the  problem  of  computing  that  correlation  when  the  ranges  of  the  students'  GT  scores 
have been restricted. In a typical school, if there were no entrance requirements, a
scatterplot of the students' FCG versus GT scores might look like figure 1a. In contrast,
figure 1b shows what the same scatterplot would look like if a GT score of 90 or above
were required for admission into the school. In general, the correlation coefficient
of the restricted school population is less than that of the unrestricted, general
population. When several different variables are used as the basis of restriction, the
problem is even more complex.

Range restriction can occur from above as well as below. For example, if all the
recruits with high GT scores are selected for more demanding schools, few will be
available for assignment to "easier" schools; thus the range of GT scores of students
in those schools will have been restricted from above. The curtailment of GT scores
from above will, in general, reduce the variance of the distribution of scores; the
correlation between GT and FCG will also be reduced.

TABLE 1

ACB-61 SUBTESTS

Subtest                       Abbreviation

Verbal                        VE
Arithmetic reasoning          AR
Pattern analysis              PA
Classification inventory      CI
Mechanical aptitude           MA
Army clerical speed           ACS
Army radio code               ARC
General information test      GIT
Shop mechanics                SM
Automotive information        AI
Electronics information       ELI

ACB-61 COMPOSITE TESTS

Composite                             Formula               Abbreviation

Infantry Combat                       (AR + 2 CI)/3         IN
Armor, Artillery, Combat Engineers    (GIT + AI)/2          AE
Electronics                           (MA + 2 ELI)/3        EL
General Maintenance                   (PA + 2 SM)/3         GM
Mechanical Maintenance                (SM + AI)/2           MM
Clerical                              (VE + ACS)/2          CL
General Technical                     (VE + AR)/2           GT
General Classification Test           (VE + AR + PA)/3      GCT

FIG. 1: AN EXAMPLE OF RANGE RESTRICTION

One of the objectives of the school performance study (reference 1) was to determine
how  well  the  ACB-61  subtests  and  composites  predicted  each  recruit's  final  course  grade. 
To  accomplish  this,  the  correlation  matrix  of  all  of  the  tests  and  the  FCG  had  to  be 
corrected  for  range  restriction  because  the  ranges  of  one  or  more  of  the  scores  had  been 
directly  restricted  by  the  entrance  requirements  of  each  school.  In  the  remainder  of  this 
report,  school  population  will  be  synonymous  with  the  restricted,  or  curtailed,  population. 

The  data  available  consisted  of  each  student's  FCG  and  his  score  on  each  subtest 
and  composite.  The  scores  of  all  subtests  and  composites  for  60,000  FY-1975  Marine 
nonprior-service  accessions  were  also  known.  The  correlation  coefficients  computed 
from the data on the 60,000 Marines were considered to be the "true" correlations in
the general population (i.e., the general population is defined to be all Marine recruits).
The  technique  used  for  correcting  for  range  restriction  was  first  developed  by  Pearson 
(reference  2)  and  later  refined  by  Burt  (reference  3)  and  Lawley  (reference  4).  The 
notation  used  in  this  study  is  Gulliksen’s  (reference  5). 

The  following  information  was  needed  to  use  the  correction  equations: 

(1) The correlation matrix of all the variables in the restricted population;

(2)  The  correlation  matrix  of  the  directly  curtailed  variables  in  the 
unrestricted  population; 

(3)  The  standard  deviations  of  all  the  variables  in  the  restricted 
population;  and 

(4)  The  standard  deviations  of  the  directly  curtailed  variables  in  the 
unrestricted  population. 

Of  the  above,  in  the  case  of  the  84  training  schools,  the  11  ACB-61  subtests  were 
designated  as  the  directly  curtailed  variables.  Items  (1)  and  (3)  were  obtained  on  recruits 
in  formal  schools,  and  items  (2)  and  (4)  came  from  the  data  of  the  60,000  FY-1975 
Marine  accessions. 

Although no school had entrance requirements on all 11 subtests, complete knowledge
of the standard deviations and correlations in the uncurtailed population was available for
them. As Burt (reference 3) points out, this is the real distinction between the directly
and indirectly restricted variables when correcting for multiple curtailment. Therefore,
the variables for which complete knowledge was available will be considered to be
explicitly selected or directly curtailed variables. Likewise, the variables for which only
incomplete data were available will be considered to be incidentally selected or indirectly
curtailed variables.


Assumptions of the Procedure


All variables are assumed to be normally distributed. The variables subject to
incidental selection are regarded as being estimated by a linear combination of the explicit
selection variables. In the school example, this means that

$$\mathrm{FCG} = b_1(\mathrm{VE}) + b_2(\mathrm{AR}) + \cdots + b_{11}(\mathrm{ELI}).$$

Furthermore, the gross score weights applied to the explicit selection variables (i.e., the $b_i$)
are assumed to be the same for the curtailed and uncurtailed populations. This assumption
in the univariate case (where FCG is estimated by only one test score) would mean that the
regression lines in the unrestricted and restricted populations would have equal slopes. Also,
it is assumed that the errors of estimate (i.e., the differences between the predicted and
actual values of the incidentally selected variables) are the same for both the unrestricted
and restricted groups. Finally, after the effects of the explicitly selected variables are
partialled out, it is assumed that the correlations among the variables subject to
incidental selection in the curtailed population are the same as the analogous partial
correlations in the uncurtailed population. In a three-variable case, say variables x, y, and z,
it is assumed that for constant z the correlation between x and y is the same in both
the unrestricted and restricted populations. This last assumption is examined in greater
detail in appendix C.
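
The equal-slope and equal-error assumptions can be illustrated in the univariate case with simulated data. The sketch below assumes a hypothetical linear model (FCG = 40 + 0.4 GT plus normal error with standard deviation 8) and a hypothetical GT distribution; none of these numbers come from the study. When the model is truly linear with constant error variance, selecting on GT leaves the slope and the standard error of estimate essentially unchanged, even though the correlation drops.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares slope and standard error of estimate for y regressed on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, math.sqrt(sse / (n - 2))

rng = random.Random(7)  # fixed seed so the run is repeatable
gt, fcg = [], []
for _ in range(20000):
    g = rng.gauss(100.0, 20.0)                        # hypothetical GT score
    gt.append(g)
    fcg.append(40.0 + 0.4 * g + rng.gauss(0.0, 8.0))  # hypothetical linear model

# Restrict the group to those who meet an entrance requirement of GT >= 90.
admitted = [(g, f) for g, f in zip(gt, fcg) if g >= 90.0]

slope_all, see_all = fit_line(gt, fcg)
slope_sel, see_sel = fit_line([g for g, _ in admitted], [f for _, f in admitted])
print(f"unrestricted: slope = {slope_all:.3f}, error of estimate = {see_all:.2f}")
print(f"restricted:   slope = {slope_sel:.3f}, error of estimate = {see_sel:.2f}")
```

If real data showed clearly different slopes or error variances in the two groups, that would be a sign that the assumptions behind the correction do not hold for those data.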

The  Correction  Equations 

The  equations  used  to  correct  for  the  effects  of  range  restriction  will  now  be  presented 
in  general  form,  and  will  be  demonstrated  by  the  example  mentioned  previously. 

Suppose that each member of a population P is administered tests $V_1, V_2, \ldots, V_a$,
and his score is recorded. Furthermore, suppose that a subpopulation Q of P is obtained
by requiring that an individual in Q scores in a particular range on tests $V_1, V_2, \ldots, V_a$.
In the example, P would be the population of all Marine recruits, the $V_i$ the 11 ACB-61
subtests, and Q the subpopulation of all Marines admitted into a particular school. Also,
suppose the members of Q are administered tests $V_{a+1}, V_{a+2}, \ldots, V_{a+t}$, and their
scores are recorded. (In the example, this would be the final exam in a particular school.)
This would make $V_{a+1}, V_{a+2}, \ldots, V_{a+t}$ the variables subject to incidental selection. If a
sample is taken from the restricted population Q, the correlation between tests $V_i$ and $V_j$, for
$i, j = 1, 2, \ldots, a+t$, can be estimated. Denote by $\hat{C}$ the matrix of correlation
coefficients calculated from a sample of Q. This would make $\hat{C}$ an (a+t) x (a+t) square
matrix. In the example, $\hat{C}$ was the 20 x 20 correlation matrix of the 11 ACB-61 subtests,
the 8 composites, and the FCG for any one school. Suppose it is necessary to estimate
what the correlation matrix of tests $V_1, V_2, \ldots, V_{a+t}$ would be if everyone in P had
also taken tests $V_{a+1}, V_{a+2}, \ldots, V_{a+t}$. This correlation matrix, not yet calculated,
of all the variables in the unrestricted population will be called $\hat{D}$. Part of $\hat{D}$ can be
estimated directly from a sample of P, since everyone in P has taken tests
$V_1, V_2, \ldots, V_a$. Therefore, the a x a submatrix of $\hat{D}$, consisting of the correlations
between the first a variables, will be called $\hat{D}_{aa}$. Still unknown is the correlation
submatrix between tests $V_1, \ldots, V_a$ and tests $V_{a+1}, V_{a+2}, \ldots, V_{a+t}$. Denote this
a x t submatrix of $\hat{D}$ by $\hat{D}_{at}$, and let $\hat{D}_{ta}$ be its transpose. Also, the correlation submatrix
between the incidentally selected variables (that is, variables $V_{a+1}, \ldots, V_{a+t}$) is
still unknown. Denote this t x t matrix by $\hat{D}_{tt}$. Partition $\hat{C}$ in a similar manner. That is,
let $\hat{C}_{at}$ be the a x t submatrix of $\hat{C}$ consisting of the correlations of the directly restricted
variables and the incidentally selected variables, and let $\hat{C}_{ta}$ be its transpose. Likewise,
let $\hat{C}_{tt}$ be the correlation submatrix of the last t variables (that is, the correlation matrix
between the incidentally selected variables).

In order to use a multiple curtailment method, it is necessary to convert $\hat{C}$ and
$\hat{D}_{aa}$ into variance-covariance matrixes C and $D_{aa}$, using:

$$\mathrm{Cov}(V_i, V_j) = \sigma_i \cdot \sigma_j \cdot r_{ij},$$

where $r_{ij}$ denotes the correlation of $V_i$ with $V_j$, $\mathrm{Cov}(V_i, V_j)$ stands for the covariance of
$V_i$ and $V_j$, and $\sigma_i$ is the standard deviation of $V_i$. For example, if $\hat{C}$ is to be transformed
into C, then $\sigma_i$ estimates the standard deviation of $V_i$ in the restricted population.
Likewise, if the $D_{aa}$ matrix is being calculated, $\sigma_i$ estimates the standard deviation in the
unrestricted population. Then, the corrected variance-covariance submatrixes are calculated
as follows:

$$D_{ta} = C_{ta}\, C_{aa}^{-1}\, D_{aa}$$

$$D_{tt} = C_{tt} + C_{ta}\, C_{aa}^{-1}\, (D_{at} - C_{at})$$

(see  reference  b). 


The diagonal of the $D_{tt}$ matrix consists of the variances of the incidentally selected
variables in the unrestricted population. Therefore, after taking the square roots of these
variances to estimate the $\sigma_i$ of the incidentally selected variables in the unrestricted
population, D can be converted into the correlation matrix $\hat{D}$ by:

$$r_{ij} = \mathrm{Cov}(V_i, V_j)/(\sigma_i \cdot \sigma_j), \qquad i, j = 1, 2, \ldots, a+t.$$
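
The whole procedure (conversion to covariances, the two correction equations, and conversion back to correlations) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: the function names, the pure-Python matrix helpers, and the numbers in the example are all hypothetical.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting; raises on a singular matrix."""
    n = len(A)
    M = [list(map(float, A[i])) + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def cor_to_cov(R, sd):
    """Cov(Vi, Vj) = sd_i * sd_j * r_ij."""
    return [[R[i][j] * sd[i] * sd[j] for j in range(len(sd))]
            for i in range(len(sd))]

def correct_for_curtailment(C_hat, sd_c, D_hat_aa, sd_d_a):
    """C_hat: (a+t) x (a+t) restricted correlations; sd_c: restricted SDs.
    D_hat_aa: a x a unrestricted correlations of the directly curtailed
    variables; sd_d_a: their unrestricted SDs.
    Returns the corrected (a+t) x (a+t) correlation matrix and the SDs."""
    a, n = len(sd_d_a), len(sd_c)
    C = cor_to_cov(C_hat, sd_c)
    D_aa = cor_to_cov(D_hat_aa, sd_d_a)
    C_aa = [row[:a] for row in C[:a]]
    C_at = [row[a:] for row in C[:a]]
    C_ta = [row[:a] for row in C[a:]]
    C_tt = [row[a:] for row in C[a:]]
    V = matmul(C_ta, inverse(C_aa))               # C_ta * C_aa^{-1}
    D_ta = matmul(V, D_aa)                        # first correction equation
    D_at = transpose(D_ta)
    diff = [[D_at[i][j] - C_at[i][j] for j in range(n - a)] for i in range(a)]
    VD = matmul(V, diff)
    D_tt = [[C_tt[i][j] + VD[i][j] for j in range(n - a)]
            for i in range(n - a)]                # second correction equation
    D = [D_aa[i] + D_at[i] for i in range(a)] + \
        [D_ta[i] + D_tt[i] for i in range(n - a)]
    sd = [math.sqrt(D[i][i]) for i in range(n)]
    D_hat = [[D[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]
    return D_hat, sd

# Hypothetical example with a = 1 directly curtailed test and t = 1 criterion.
D_hat, sd = correct_for_curtailment(
    C_hat=[[1.0, 0.4], [0.4, 1.0]], sd_c=[10.0, 8.0],
    D_hat_aa=[[1.0]], sd_d_a=[20.0])
print(round(D_hat[0][1], 4), round(sd[1], 4))
```

With a single directly curtailed variable the result should agree with the familiar single-variable correction, which gives a convenient check on any implementation of the matrix equations.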

In the example, only the 11x11 matrix of correlations of the ACB-61 subtests was
included in $\hat{C}_{aa}$. This is because the composite tests are linear combinations of the
ACB-61 subtests; and, since $C_{aa}^{-1}$ must be computed in the correction equation, only
linearly independent variables can be included in the set of directly curtailed variables.
Figures 2 and 3 show the $\hat{C}$ and $\hat{D}$ matrixes in PIBAD (an administrative school)
divided into submatrixes.
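
The need for linear independence can be seen in a small numerical case. The sketch below uses hypothetical variances and a hypothetical covariance for VE and AR, then adds the composite GT = (VE + AR)/2. Because GT is an exact linear combination of the other two, the 3x3 covariance matrix has a zero determinant and cannot be inverted, while the 2x2 matrix of the subtests alone can.

```python
def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical values: both subtests have variance 400 and covariance 240.
var_ve, var_ar, cov_ve_ar = 400.0, 400.0, 240.0

# Covariances involving the composite GT = (VE + AR) / 2.
cov_ve_gt = (var_ve + cov_ve_ar) / 2.0
cov_ar_gt = (cov_ve_ar + var_ar) / 2.0
var_gt = (var_ve + 2.0 * cov_ve_ar + var_ar) / 4.0

subtests_only = [[var_ve, cov_ve_ar],
                 [cov_ve_ar, var_ar]]
with_composite = [[var_ve, cov_ve_ar, cov_ve_gt],
                  [cov_ve_ar, var_ar, cov_ar_gt],
                  [cov_ve_gt, cov_ar_gt, var_gt]]

det_subtests = det2(subtests_only)
det_with_composite = det3(with_composite)
print(det_subtests, det_with_composite)
```

In practice a determinant that is very small relative to the size of the matrix entries is as much a warning sign as an exact zero, since the inverse then magnifies rounding and sampling error.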




COMPARISON WITH ANOTHER TECHNIQUE


Another  method  for  correcting  for  range  restriction  is  given  by  Thorndike 
(reference  6).  This  method  was  originally  developed  by  Karl  Pearson,  and  is  actually 
just  a special  case  of  the  multiple  curtailment  model  already  discussed.  Thorndike's 
method  assumes  direct  curtailment  on  only  one  variable.  It  has  the  advantage  of  being 
easier and requiring less information than the equations used for multiple direct
curtailment. However, it is neither as general nor, as will be shown empirically, as accurate
as the multiple direct curtailment model.

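When only one variable has been directly curtailed, there are two basic formulas for
correcting for range restriction. If variable 3 has been restricted, then $R_{12}$, defined
to be the corrected correlation coefficient between variable 1 and variable 2, is given
by:

$$R_{12} = \frac{r_{12} + r_{13}\, r_{23} \left( \frac{S_3^2}{s_3^2} - 1 \right)}{\sqrt{\left[ 1 + r_{13}^2 \left( \frac{S_3^2}{s_3^2} - 1 \right) \right] \left[ 1 + r_{23}^2 \left( \frac{S_3^2}{s_3^2} - 1 \right) \right]}}$$

where $r_{ij}$ is the uncorrected correlation coefficient between variable i and variable j,
$S_3$ the standard deviation of variable 3 in the unrestricted population, and $s_3$ the standard
deviation of variable 3 in the restricted population. When variable 3 and variable 1 are
the same, the above equation reduces to:

$$R_{12} = \frac{r_{12}\, (S_1 / s_1)}{\sqrt{1 - r_{12}^2 + r_{12}^2\, (S_1^2 / s_1^2)}}$$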
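
As a numerical check, the two single-variable corrections can be sketched in Python. The formulas coded here are the forms in which these corrections are commonly stated and are an assumption of this sketch; the function and parameter names are hypothetical, with `S` denoting the unrestricted and `s` the restricted standard deviation of the directly curtailed variable.

```python
import math

def correct_three_variable(r12, r13, r23, S3, s3):
    """Corrected correlation between variables 1 and 2 when variable 3
    was directly curtailed (S3: unrestricted SD, s3: restricted SD)."""
    k = (S3 / s3) ** 2 - 1.0
    numerator = r12 + r13 * r23 * k
    denominator = math.sqrt((1.0 + r13 ** 2 * k) * (1.0 + r23 ** 2 * k))
    return numerator / denominator

def correct_direct(r12, S1, s1):
    """Corrected correlation between variables 1 and 2 when variable 1
    itself was directly curtailed."""
    u = S1 / s1
    return r12 * u / math.sqrt(1.0 - r12 ** 2 + (r12 * u) ** 2)

# Hypothetical values: restricted r = 0.4, SD halved by selection.
r_direct = correct_direct(0.4, S1=20.0, s1=10.0)

# Setting variable 3 equal to variable 1 (r13 = 1, r23 = r12) should
# reproduce the direct formula.
r_reduced = correct_three_variable(0.4, 1.0, 0.4, S3=20.0, s3=10.0)
print(round(r_direct, 4), round(r_reduced, 4))
```

When the restricted and unrestricted standard deviations are equal, both functions return the uncorrected coefficient, as they should.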


In  order  to  check  and  compare  the  two  correction  methods,  a test  was  conducted.  A 
correlation  matrix  of  the  11  ACB-61  subtests  was  calculated  for  each  of  26  Marine  schools 
with  sample  populations  of  225  to  2,400  students. 


A correlation  matrix  of  the  same  11  variables  was  calculated  from  the  sample  of 
60,000  FY-1975  Marines.  As  before,  this  matrix  was  assumed  to  represent  the  "true" 
correlation  matrix  in  the  general  population.  In  order  to  use  the  multiple  curtailment 
method,  the  first  seven  ACB-61  subtests  were  arbitrarily  designated  as  the  directly 
restricted variables. That is, the 7x7 variance-covariance matrix of VE, AR, PA, CI,
MA, ACS, and ARC (computed from the data on the 60,000 Marines) made up the $D_{aa}$
matrix of the model, while the variance-covariance matrix of all 11 subtests computed
from each school's data made up the C matrix. The range correction equations gave
an estimate of the 4x4 correlation matrix of subtests subject to "incidental" curtailment,
and of the 4x7 and 7x4 submatrixes of correlations between the indirectly and directly
curtailed variables. Comparing these submatrixes with the corresponding submatrixes of the
true correlation matrix gave an indication of the accuracy of the range correction
equations. Similarly, assuming direct curtailment on VE alone, each school's correlation
matrix was corrected for range restriction, using Thorndike's equations.

The  matrix  of  correlations  found  by  using  each  of  the  two  methods  was  subtracted 
from  the  matrix  of  true  correlations,  and  the  entries  of  the  three  difference  matrixes 
were  squared  and  summed.  To  fairly  compare  the  two  correction  techniques,  the 
entries  in  the  7x7  submatrix  of  differences  of  correlation  coefficients  of  the  first  seven 
variables (the variables on which direct curtailment was assumed in the multiple
curtailment method) were not squared and summed.


Expressing this in matrix notation, let M(i, j) be the correlation matrix corrected
by the multiple curtailment method. Similarly, let S(i, j) represent the matrix corrected
by assuming direct curtailment on only a single variable, VE. Let T(i, j) denote the
"true" correlation matrix. Then, define the multiple variable index of accuracy (MVA)
as:

$$\mathrm{MVA} = \sum_{i \text{ or } j > 7} \left[ T(i,j) - M(i,j) \right]^2,$$

and define the single variable index of accuracy (SVA) as:

$$\mathrm{SVA} = \sum_{i \text{ or } j > 7} \left[ T(i,j) - S(i,j) \right]^2.$$
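
Both indexes are the same computation applied to different corrected matrixes, so a single helper suffices. The sketch below is illustrative: the function name and the small 3x3 matrixes are hypothetical, and the parameter `n_direct` plays the role of the 7 directly curtailed variables in the definitions above.

```python
def accuracy_index(true_corr, corrected, n_direct=7):
    """Sum of squared differences between the true and corrected correlation
    matrixes over all entries (i, j) with i > n_direct or j > n_direct
    (1-based), i.e. skipping the block of directly curtailed variables."""
    total = 0.0
    n = len(true_corr)
    for i in range(n):
        for j in range(n):
            # 0-based indices: i >= n_direct is the 1-based test i > n_direct.
            if i >= n_direct or j >= n_direct:
                total += (true_corr[i][j] - corrected[i][j]) ** 2
    return total

# Hypothetical 3-variable example in which the first 2 variables are treated
# as directly curtailed. The large error in entry (1, 2) is ignored because
# both of its indices fall inside the directly curtailed block.
T = [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.4],
     [0.3, 0.4, 1.0]]
M = [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.4],
     [0.2, 0.4, 1.0]]
index = accuracy_index(T, M, n_direct=2)
print(index)
```

Because the sum runs over both (i, j) and (j, i), each off-diagonal discrepancy outside the excluded block is counted twice, which matches the definitions as written.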

Table 2 shows the SVAs and MVAs for each of the 26 Marine Corps schools. Table 3
gives  the  names  and  abbreviations  of  the  schools  used  in  table  2.  Tables  4 through  9 
show  the  relevant  matrixes  for  one  school,  AGAFAM. 

With  respect  to  the  test  used  in  this  analysis  to  check  and  compare  the  two  correction 
methods  for  range  restriction,  the  multiple  curtailment  method  is  superior  to  Thorndike's. 
This  conclusion  is  based  only  on  empirical  evidence.  However,  in  many  cases  there  are 
theoretical  reasons  for  assuming  that  more  than  one  variable  is  directly  curtailed.  In 
addition, the multiple curtailment method allows the analyst to use all of the true
correlation coefficients available. This last reason is particularly important if one wants to
use  the  corrected  correlation  coefficient  matrix  in  a multiple  regression  or  factor 
analysis. 

TABLE 2

SVA AND MVA FOR 26 MARINE CORPS SCHOOLS

School         SVA        MVA      SVA/MVA

FOODSER       .1950      .0738       2.6
MP            .5329      .2464       2.2
AGAVCC        .3640      .2075       1.8
AMMOT         .9894      .2067       4.8
EEMECH        .5970      .2107       2.8
IWPRPR        .8314      .3294       2.5
COSPEC        .4656      .1365       3.4
AGAZ         1.0887      .4731       2.3
AGMAROC       .2650      .1408       1.9
(W)SEADU      .4098      .1739       2.4
(E)SEADU      .6699      .2222       3.0
CBTENG        .5091      .2614       1.9
FARTYFC       .6064      .1802       3.4
AGAV         5.5397     2.3412       2.4
AGAVR        7.1142     2.6134       2.7
BECF03       2.9298      .5522       5.3
BECF10       1.7401      .4715       3.7
HQBAD         .5822      .0994       5.9
AGMARAK       .3993      .1332       3.0
AUTOMEC       .4408      .1153       3.8
AGADJ         .9310      .1624       5.7
AGBHEL       1.2029      .0826      14.6
PIBAD         .1727      .0768       2.2
COMMCTR      3.2016      .1876      17.1
FRADIO       1.0343      .2680       3.9
AGAFAM        .3292      .0472       7.0

TABLE 3

NAMES AND ABBREVIATIONS FOR
26 MARINE CORPS SCHOOLS

Abbreviation    Title

FOODSER         Basic Food Service Course
MP              Law Enforcement (MP) Course
AGAVCC          Aviation Crash Crewman Course
AMMOT           Ammunition Storage Course
EEMECH          Basic Engineers Equipment Mechanics Course
IWPRPR          Small Arms Repair Course
COSPEC          Law Enforcement (Corrections Specialist) Course
AGAZ            Aviation Maintenance Administration
AGMAROC         Marine Aviation Operations Clerical
(W)SEADU        Sea Duty Indoctrination Course (West Coast)
(E)SEADU        Sea Duty Indoctrination Course (East Coast)
CBTENG          Basic Combat Engineer Course
FARTYFC         Field Artillery Fire Control Course
AGAV            Avionics Technician Course
AGAVR           Avionics Repair Course
BECF03          Radio Fundamentals Course
BECF10          Ground Radio Repair Course
HQBAD           Basic Administration Course (Camp Pendleton)
AGMARAK         Marine Aviation Supply (Mechanical)
AUTOMEC         Basic Auto Mechanics Course
AGADJ           Aviation Machinist Mate (Jet Engine) Course
AGBHEL          Basic Helicopter Course
PIBAD           Basic Administration Course (Parris Island)
COMMCTR         Communications Center Man Course
FRADIO          Field Radio Operator Course
AGAFAM          Aviation Familiarization Course

TABLE

MATRIX OF UNCORRECTED CORRELATION COEFFICIENTS


TABLE 7


TABLE 8

DIFFERENCE BETWEEN 60,000-MAN MATRIX AND
MATRIX CORRECTED FOR CURTAILMENT ON VE

APPENDIX A

FORTRAN PROGRAM

This appendix is the program used to correct the effects of multiple curtailment on
the correlation matrixes of 84 Marine Corps schools.

The notation used in the program is slightly different from the notation used in this
report. For example, $\hat{D}$ and $\hat{C}$ are denoted by DPRIME and CPRIME. Also, instead of
using "t" as a subscript, "x" is used; for example, the matrix DXA in the program
corresponds to $D_{ta}$ in the report. Finally, since the $C_{aa}$ matrix is inverted, we used
the IBM routine IMINV.

      PROGRAM RANGE
      DIMENSION LM(11),MM(11)
      REAL DSIGMA(27),DPRIME(20,20),CSIGMA(27),CPRIME(27,27),CXX(16,16),
     + C(27,27),DAA(20,20),HOLD(11,11),CAA(11,11),CXA(16,11),CMEAN(27),
     + CAX(11,16),DXA(16,11),DAX(11,16),S(11,11),
     +VXA(16,11),DMINUSC(11,16),DANSWER(27,27),DCORR(27,27),DXX(16,16)
      INTEGER AA,T
C
C     READ IN 'AA' THE NUMBER OF DIRECTLY CURTAILED VARIABLES
      READ 987,AA
C
C     READ IN 'DSIGMA' THE STANDARD DEVIATIONS OF THE DIRECTLY CURTAILED
C     READ IN 'DPRIME' THE MATRIX OF CORRELATIONS OF THE DIRECTLY
C     CURTAILED VARIABLES IN THE GENERAL POPULATION
C
      READ 11,(DSIGMA(I),I=1,20)
      DO 782 I=1,20
      READ 542,(DPRIME(I,J),J=1,20)
C
C     CONVERT DPRIME INTO 'DAA' THE VARIANCE-COVARIANCE MATRIX
      DO 783 J=1,20
  783 DAA(I,J) = DPRIME(I,J)*DSIGMA(I)*DSIGMA(J)
  782 CONTINUE
C
C     READ IN 'T' THE NUMBER OF INDIRECTLY CURTAILED VARIABLES, THE NAME
C     OF THIS SCHOOL, AND THE NUMBER OF MEN IN IT
    1 READ 1005, T,NAME1,NAME2
      IF (T.EQ.0) GO TO 9999
      READ 100, XNOMEN
      L=T+AA
      PRINT 1004, NAME1,NAME2,XNOMEN,L
C
C     READ IN THE MEANS, STANDARD DEVIATIONS, AND THE MATRIX OF
C     CORRELATIONS FOR ALL THE VARIABLES (BOTH CURTAILED AND
C     UNCURTAILED) IN THIS SCHOOL
      READ 11,(CMEAN(I),I=1,L)
      READ 11,(CSIGMA(I),I=1,L)
      DO 924 I=1,L
  924 READ 11,(CPRIME(I,J),J=1,L)
C
C     CONVERT CPRIME INTO VARIANCE-COVARIANCE MATRIX, 'C'
      DO 81 J=1,L
      DO 81 I=1,L
   81 C(I,J)=CPRIME(I,J)*CSIGMA(I)*CSIGMA(J)
C
C     SPLIT C INTO ITS COMPONENT MATRICES, 'CAA', 'CXX', 'CXA', 'CAX'
      DO 113 J=1,AA
      DO 113 I=1,AA
      CAA(I,J)=C(I,J)
  113 HOLD(I,J)=CAA(I,J)
      DO 30 J=1,T
      L=AA+J
      DO 30 I=1,T
      K=AA+I
   30 CXX(I,J)=C(K,L)
      DO 40 I=1,T
      K=AA+I
      DO 40 J=1,AA
      CXA(I,J) = C(K,J)
   40 CAX(J,I)=CXA(I,J)
C
C     CALCULATE 'DXA' AND ITS TRANSPOSE, 'DAX'
C     FIRST INVERT CAA.  PRINT OUT THE DETERMINANT OF CAA, AND THE
C     PRODUCT OF CAA AND ITS INVERSE TO BE SURE CAA IS NONSINGULAR
      CALL IMINV(CAA,AA,DET,LM,MM)
      PRINT 457,DET
      IF(DET.EQ.0) PRINT 129
      DO 1001 I=1,AA
      DO 1001 J=1,AA
      SUM=0
      DO 1002 K=1,AA
 1002 SUM = HOLD(I,K)*CAA(K,J) + SUM
 1001 S(I,J)= SUM
      DO 925 I=1,AA
  925 PRINT 926,(S(I,J),J=1,AA)
      PRINT 1007
C
C     NOW CALCULATE DXA
      DO 78 I=1,T
      DO 78 J=1,AA
      SUM=0
      DO 178 K=1,AA
  178 SUM = CXA(I,K)*CAA(K,J)+SUM
   78 VXA(I,J) = SUM
C
      DO 103 I=1,T
      DO 103 J=1,AA
      SUM=0
      DO 104 K=1,AA
  104 SUM=VXA(I,K)*DAA(K,J) + SUM
      DXA(I,J)=SUM
  103 DAX(J,I)=DXA(I,J)
C
C     CALCULATE DXX
C     FIRST COMPUTE 'DMINUSC' = DAX-CAX
      DO 79 J=1,T
      DO 79 I=1,AA
   79 DMINUSC(I,J)=DAX(I,J)-CAX(I,J)
      DO 83 J=1,T
      DO 83 I=1,T
      SUM=0
      DO 166 K=1,AA
  166 SUM=VXA(I,K)*DMINUSC(K,J)+SUM
   83 DXX(I,J)=CXX(I,J)+SUM
C
C     CONSTRUCT 'DANSWER' FROM THE FOUR SUBMATRICES DAA, DXA, DAX, DXX
      DO 92 I=1,T
      L=I+AA
      DO 92 J=1,AA
      DANSWER(L,J)=DXA(I,J)
   92 DANSWER(J,L)=DANSWER(L,J)
      DO 93 J=1,T
      M=AA+J
      DO 93 I=1,T
      L=AA+I
   93 DANSWER(L,M)=DXX(I,J)
      DO 91 J=1,20
      DO 91 I=1,20
   91 DANSWER(I,J) = DAA(I,J)
C
C     COMPUTE THE STANDARD DEVIATIONS OF THE INDIRECTLY CURTAILED
C     VARIABLES
      L=T+AA
      DO 95 I=21,L
   95 DSIGMA(I)=SQRT(DANSWER(I,I))
C
C     CONVERT DANSWER INTO THE CORRELATION MATRIX, 'DCORR'
      DO 94 J=1,L
      DO 94 I=1,L
   94 DCORR(I,J)=DANSWER(I,J)/(DSIGMA(I)*DSIGMA(J))
C
C     OUTPUT THE MEANS, STANDARD DEVIATIONS AND CORRECTED CORRELATION
C     MATRIX
      PRINT 1008,(CMEAN(I),I=1,L)
      WRITE(1,12) (CMEAN(I),I=1,L)
      PRINT 1008,(DSIGMA(I),I=1,L)
      WRITE(1,12) (DSIGMA(I),I=1,L)
      DO 927 I=1,L
      WRITE(1,11) (DCORR(I,J),J=1,L)
  927 PRINT 1100,(DCORR(I,J),J=1,L)
      ENDFILE 1
      GO TO 1
C
   11 FORMAT(8F10.7)
   12 FORMAT(8F10.6)
  100 FORMAT(F4.0)
  129 FORMAT(1H ,*SINGULARITY*)
  457 FORMAT(1H0,*THE DET = *,E14.7)
  542 FORMAT(8F10.7/8F10.7/4F10.7)
  926 FORMAT(1X,20F6.3)
  987 FORMAT(I2)
 1004 FORMAT(1H1,* THIS IS SCHOOL *,2A7,* WITH *,F4.0,* MEN*,//,
     1* L IS *,I5)
 1005 FORMAT(I2,2A7)
 1006 FORMAT(A7)
 1007 FORMAT(///)
 1008 FORMAT(1X,10F12.7)
 1100 FORMAT(1X,10F10.7)
 9999 STOP
      END
A -5 
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APPENDIX  B 


APL  PROGRAM 

This  appendix  contains  an  APL  version  of  the  computer  program  used  to  correct  for 
multiple  curtailment.  The  definitions  of  variables  are  given  below;  the  names  in  paren- 
theses are  the  names  of  the  corresponding  variables  as  given  in  the  text. 


ALLCOR          - matrix of correlations of directly restricted
                  variables in the general population (symbol
                  illegible in source). Must be supplied.

CLASSCOR        - matrix of correlations of all variables in
                  restricted population (symbol illegible in
                  source). Must be supplied.

ASIZE           - number of directly curtailed variables (a).

XSIZE           - number of indirectly curtailed variables (t).

C               - variance-covariance matrix computed from
                  CLASSCOR (C).

CAA, CXA, CXX   - variance-covariance submatrixes of C (C_aa,
                  C_xa, C_xx).

DAA             - variance-covariance matrix computed from
                  ALLCOR (D_aa).

DXA, DXX        - variance-covariance matrixes computed by the
                  program (D_xa, D_xx).

ANSWXA, ANSWXX  - correlation matrixes computed from DXA and DXX
                  (symbols illegible in source).

SIGC, SIGD      - vectors of standard deviations in restricted and
                  general populations (symbols illegible in source),
                  respectively.


The actual correction equations are found in statements 28 through 31. The rest
of the program initializes the arrays and converts to and from correlation and
variance-covariance matrixes.
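
For readers who do not use APL, the correction step can be sketched in Python as follows. This is a minimal illustration, not the report's program. It assumes the usual form of the multivariate correction, D_xa = C_xa C_aa^-1 D_aa and D_xx = C_xx + C_xa C_aa^-1 (D_ax - C_ax), which is how statements 28 through 31 read as far as the listing is legible. The function names (transpose, matmul, inverse, correct_covariances) and the numbers in the example are hypothetical, and the directly restricted variables are assumed to come first in the matrix.

```python
from math import sqrt


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def correct_covariances(c, d_aa):
    """Return (d_xa, d_xx) for the general population.

    c    : covariance matrix of all variables in the restricted population,
           directly restricted (a) variables first (assumed ordering)
    d_aa : covariance matrix of the a-variables in the general population
    """
    a = len(d_aa)
    c_aa = [row[:a] for row in c[:a]]
    c_ax = [row[a:] for row in c[:a]]
    c_xa = [row[:a] for row in c[a:]]
    c_xx = [row[a:] for row in c[a:]]
    v_xa = matmul(c_xa, inverse(c_aa))            # C_xa C_aa^-1
    d_xa = matmul(v_xa, d_aa)                     # corrected x-with-a covariances
    shift = [[p - q for p, q in zip(r1, r2)]      # D_ax - C_ax
             for r1, r2 in zip(transpose(d_xa), c_ax)]
    d_xx = [[p + q for p, q in zip(r1, r2)]       # corrected x-with-x covariances
            for r1, r2 in zip(c_xx, matmul(v_xa, shift))]
    return d_xa, d_xx


# Hypothetical example: one selector and one criterion, both with unit
# variance and r = .50 in the restricted group; the selector's standard
# deviation in the general population is twice as large (variance 4).
c = [[1.0, 0.5], [0.5, 1.0]]
d_aa = [[4.0]]
d_xa, d_xx = correct_covariances(c, d_aa)
corrected_r = d_xa[0][0] / sqrt(d_aa[0][0] * d_xx[0][0])
print(round(corrected_r, 4))  # 0.7559
```

With a single selector the result agrees with the familiar univariate correction, r k / sqrt(1 - r^2 + r^2 k^2) with k = 2, which is a convenient check on the matrix form.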


      ∇ CLASSCOR CORRECTION ALLCOR
[1]   ASIZE←1↑(⍴ALLCOR)
[2]   XSIZE←1↑((⍴CLASSCOR)-⍴ALLCOR)
[3]   TOTAL←ASIZE+XSIZE
[4]   C←(TOTAL,TOTAL)⍴0
[5]   DAA←(ASIZE,ASIZE)⍴0
[6]   ANSWXX←(XSIZE,XSIZE)⍴0
[7]   ANSWXA←(XSIZE,ASIZE)⍴0
[8]   ⍝ CONVERT TO VAR-COVAR MATRICES
[9]   I←1
[10]  ROW:J←1
[11]  COL:C[I;J]←C[J;I]←CLASSCOR[I;J]×SIGC[I]×SIGC[J]
[12]  J←J+1
[13]  →COL×⍳(J≤TOTAL)
[14]  I←I+1
[15]  →ROW×⍳(I≤TOTAL)
[16]  CAA←(ASIZE,ASIZE)↑C
[17]  CAX←(ASIZE,-XSIZE)↑C
[18]  CXA←((-XSIZE),ASIZE)↑C ⍝ STATEMENT ILLEGIBLE IN SOURCE; INFERRED FROM USE IN [28]
[19]  CXX←(-XSIZE,XSIZE)↑C
[20]  ⍝ NOW DO DAA
[21]  I←1
[22]  ROW1:J←1
[23]  COL1:DAA[I;J]←DAA[J;I]←ALLCOR[I;J]×SIGD[I]×SIGD[J]
[24]  J←J+1
[25]  →COL1×⍳(J≤ASIZE)
[26]  I←I+1
[27]  →ROW1×⍳(I≤ASIZE)
[28]  VXA←CXA+.×(⌹CAA)
[29]  DXA←VXA+.×DAA
[30]  DMCAX←(⍉DXA)-CAX ⍝ VARIABLE NAME PARTLY ILLEGIBLE IN SOURCE
[31]  DXX←CXX+(VXA+.×DMCAX)
[32]  ⍝ CONVERT DXA AND DXX BACK TO CORRELATION MATRICES
[33]  SIGD[TOTAL]←DXX[XSIZE;XSIZE]*0.5
[34]  I←1
[35]  ROW2:J←1
[36]  COL2:ANSWXX[I;J]←ANSWXX[J;I]←DXX[I;J]÷(SIGD[ASIZE+I]×SIGD[ASIZE+J])
[37]  J←J+1
[38]  →COL2×⍳(J≤XSIZE)
[39]  I←I+1
[40]  →ROW2×⍳(I≤XSIZE)
[41]  ANSWXX[XSIZE;]←ANSWXX[;XSIZE]
[42]  ⍝ NOW DO DXA
[43]  I←1
[44]  ROW3:J←1
[45]  COL3:ANSWXA[I;J]←DXA[I;J]÷(SIGD[ASIZE+I]×SIGD[J])
[46]  J←J+1
[47]  →COL3×⍳(J≤ASIZE)
[48]  I←I+1
[49]  →ROW3×⍳(I≤XSIZE)
[50]  'THE RESULTING CORRECTED CORRELATION MATRIX IS'
[51]  ⎕←(ALLCOR,[1]ANSWXA),((⍉ANSWXA),[1]ANSWXX)
      ∇
APPENDIX C


EQUALITY OF PARTIAL CORRELATIONS

One of the key assumptions in deriving the range correction equations is that, after
the effects of the explicitly selected variables are partialled out, the correlations
between the variables subject to incidental selection are the same in the restricted
population as in the uncurtailed population. This appendix shows how well this
assumption holds in the example of the Marine training schools.

Just as in the COMPARISON WITH ANOTHER TECHNIQUE section of this report, the
first seven ACB-61 subtests were arbitrarily designated as the directly restricted variables
for four Marine training schools. Table C-1 shows the partial correlations between the
other four variables (i.e., the incidentally selected variables), controlling for the effects
of the directly restricted variables, in both the school populations and the population of
all FY-1975 Marine recruits.
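
As an illustration of the quantity being tabulated, the following Python sketch computes partial correlations among a set of variables after a group of control variables has been partialled out of a correlation matrix. It is a sketch under stated assumptions, not the procedure used to produce the tables: the function names solve and partial_correlations are hypothetical, the control variables are assumed to occupy the first rows and columns of the matrix, and the three-variable example uses made-up correlations.

```python
from math import sqrt


def solve(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan
    elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(v) for v in b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("control-variable matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(r, n_control):
    """Partial correlations among the remaining variables, controlling
    for the first n_control variables of correlation matrix r."""
    a = n_control
    r_aa = [row[:a] for row in r[:a]]
    r_ax = [row[a:] for row in r[:a]]
    r_xa = [row[:a] for row in r[a:]]
    r_xx = [row[a:] for row in r[a:]]
    w = solve(r_aa, r_ax)                      # R_aa^-1 R_ax
    k = len(r_xx)
    # Residual covariance: R_xx - R_xa R_aa^-1 R_ax
    s = [[r_xx[i][j] - sum(r_xa[i][m] * w[m][j] for m in range(a))
          for j in range(k)] for i in range(k)]
    # Rescale the residual covariances to correlations.
    return [[s[i][j] / sqrt(s[i][i] * s[j][j]) for j in range(k)]
            for i in range(k)]


# Made-up example: variable 0 is the control; variables 1 and 2 correlate
# .40 with each other and .50 each with the control.
r = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.4],
     [0.5, 0.4, 1.0]]
p = partial_correlations(r, 1)
print(round(p[0][1], 4))  # 0.2
```

With one control variable this reduces to the textbook formula (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), which gives .15 / .75 = .20 for the example.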

The similarity of the first four columns of table C-1 to the last column indicates that
this key assumption is justified when several variables are the basis of curtailment.
Similarly, table C-2 shows the accuracy of the corresponding assumption when only one
variable is the basis for curtailment.
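
One simple way to summarize the two tables is the largest absolute difference between any school's partial correlation and the corresponding value for all recruits. The Python sketch below does this with the values transcribed from tables C-1 and C-2; the summary statistic and the name max_abs_gap are choices made here for illustration and do not appear in the report.

```python
# Partial correlations transcribed from tables C-1 and C-2.
# Each row: (AGAFAM, FRADIO, COMMCTR, PIBAD, all FY 1975 recruits).
TABLE_C1 = {
    "ELI with AI":  (0.32, 0.33, 0.31, 0.37, 0.34),
    "ELI with SM":  (0.22, 0.24, 0.20, 0.27, 0.24),
    "ELI with GIT": (0.20, 0.08, 0.13, 0.20, 0.18),
    "AI with SM":   (0.43, 0.44, 0.45, 0.44, 0.42),
    "AI with GIT":  (0.34, 0.32, 0.39, 0.29, 0.31),
    "SM with GIT":  (0.29, 0.24, 0.28, 0.26, 0.23),
}
TABLE_C2 = {
    "ELI with AI":  (0.41, 0.37, 0.42, 0.55, 0.45),
    "ELI with SM":  (0.33, 0.29, 0.35, 0.53, 0.38),
    "ELI with GIT": (0.28, 0.13, 0.24, 0.47, 0.29),
    "AI with SM":   (0.53, 0.53, 0.55, 0.64, 0.56),
    "AI with GIT":  (0.43, 0.39, 0.47, 0.52, 0.43),
    "SM with GIT":  (0.40, 0.35, 0.39, 0.61, 0.39),
}


def max_abs_gap(table):
    """Largest |school value - all-recruits value| over every cell."""
    return max(abs(school - row[-1])
               for row in table.values()
               for school in row[:-1])


print(round(max_abs_gap(TABLE_C1), 2))  # 0.1  (seven control subtests)
print(round(max_abs_gap(TABLE_C2), 2))  # 0.22 (one control subtest)
```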

TABLE C-1

PARTIAL CORRELATIONS BETWEEN THE LAST FOUR ACB-61
SUBTESTS CONTROLLING FOR THE FIRST SEVEN SUBTESTS

                                School                  All Marine
                 ------------------------------------   recruits
Variable pair    AGAFAM   FRADIO   COMMCTR   PIBAD      in FY 1975

ELI with AI       .32      .33       .31      .37          .34
ELI with SM       .22      .24       .20      .27          .24
ELI with GIT      .20      .08       .13      .20          .18
AI with SM        .43      .44       .45      .44          .42
AI with GIT       .34      .32       .39      .29          .31
SM with GIT       .29      .24       .28      .26          .23


TABLE C-2

PARTIAL CORRELATIONS BETWEEN THE LAST FOUR ACB-61
SUBTESTS CONTROLLING FOR THE FIRST SUBTEST

                                School                  All Marine
                 ------------------------------------   recruits
Variable pair    AGAFAM   FRADIO   COMMCTR   PIBAD      in FY 1975

ELI with AI       .41      .37       .42      .55          .45
ELI with SM       .33      .29       .35      .53          .38
ELI with GIT      .28      .13       .24      .47          .29
AI with SM        .53      .53       .55      .64          .56
AI with GIT       .43      .39       .47      .52          .43
SM with GIT       .40      .35       .39      .61          .39


